AITopics

Country: Europe (0.46)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Neural Information Processing SystemsFeb-19-2026, 06:23:33 GMT

958c530554f78bcd8e97125b70e6973d-Supplemental.pdf

matrix, stationary distribution, transition matrix, (13 more...)

Country:

Europe > United Kingdom > England (0.04)
Asia > Middle East > Israel (0.04)
Asia > Japan (0.04)
(4 more...)

Genre: Workflow (0.46)

Industry:

Government (1.00)
Health & Medicine (0.68)
Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Neural Information Processing SystemsFeb-9-2026, 01:23:37 GMT

Supplementary Material for A TISS: Autoregressive Transformers for Indoor Scene Synthesis Despoina Paschalidou

A TISS with various transformer models that consider ordering.

category, machine learning, natural language, (18 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

van der Vaart, Pascal R., Yorke-Smith, Neil, Spaan, Matthijs T. J.

Priors Matter: Addressing Misspecification in Bayesian Deep Q-Learning

arXiv.org Artificial IntelligenceSep-1-2025

Uncertainty quantification in reinforcement learning can greatly improve exploration and robustness. Approximate Bayesian approaches have recently been popularized to quantify uncertainty in model-free algorithms. However, so far the focus has been on improving the accuracy of the posterior approximation, instead of studying the accuracy of the prior and likelihood assumptions underlying the posterior. In this work, we demonstrate that there is a cold posterior effect in Bayesian deep Q-learning, where contrary to theory, performance increases when reducing the temperature of the posterior. To identify and overcome likely causes, we challenge common assumptions made on the likelihood and priors in Bayesian model-free algorithms. We empirically study prior distributions and show through statistical tests that the common Gaussian likelihood assumption is frequently violated. We argue that developing more suitable likelihoods and priors should be a key focus in future Bayesian reinforcement learning research and we offer simple, implementable solutions for better priors in deep Q-learning that lead to more performant Bayesian algorithms.

likelihood, machine learning, reinforcement learning, (16 more...)

2508.21488

Country: Europe > Austria (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Neural Information Processing SystemsAug-16-2025, 03:58:49 GMT

A Additional details regarding

GANs trained by a two time-scale update rule converge to a local Nash equilibrium.

artificial intelligence, machine learning, natural language, (16 more...)

Country:

Europe > United Kingdom > England (0.04)
Asia > Middle East > Israel (0.04)
Asia > Japan (0.04)
(4 more...)

Genre: Workflow (0.46)

Industry:

Government (1.00)
Health & Medicine (0.68)
Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Neural Information Processing SystemsAug-14-2025, 20:57:54 GMT

64986d86a17424eeac96b08a6d519059-Supplemental.pdf

category, computer vision and pattern recognition, sceneformer, (14 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)

Genre: Research Report (0.46)

Industry: Appliances & Durable Goods (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceMay-9-2025

Is there a half-life for the success rates of AI agents?

Ord, Toby

Building on the recent empirical work of Kwa et al. (2025), I show that within their suite of research-engineering tasks the performance of AI agents on longer-duration tasks can be explained by an extremely simple mathematical model -- a constant rate of failing during each minute a human would take to do the task. This implies an exponentially declining success rate with the length of the task and that each agent could be characterised by its own half-life. This empirical regularity allows us to estimate the success rate for an agent at different task lengths. And the fact that this model is a good fit for the data is suggestive of the underlying causes of failure on longer tasks -- that they involve increasingly large sets of subtasks where failing any one fails the task. Whether this model applies more generally on other suites of tasks is unknown and an important subject for further work.

agent, artificial intelligence, time horizon, (14 more...)

2505.05115

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.59)

Yang, Adam, Chen, Chen, Pitas, Konstantinos

Just rephrase it! Uncertainty estimation in closed-source language models via multiple rephrased queries

arXiv.org Artificial IntelligenceJun-16-2024

State-of-the-art large language models are sometimes distributed as open-source software but are also increasingly provided as a closed-source service. These closed-source large-language models typically see the widest usage by the public, however, they often do not provide an estimate of their uncertainty when responding to queries. As even the best models are prone to ``hallucinating" false information with high confidence, a lack of a reliable estimate of uncertainty limits the applicability of these models in critical settings. We explore estimating the uncertainty of closed-source LLMs via multiple rephrasings of an original base query. Specifically, we ask the model, multiple rephrased questions, and use the similarity of the answers as an estimate of uncertainty. We diverge from previous work in i) providing rules for rephrasing that are simple to memorize and use in practice ii) proposing a theoretical framework for why multiple rephrased queries obtain calibrated uncertainty estimates. Our method demonstrates significant improvements in the calibration of uncertainty estimates compared to the baseline and provides intuition as to how query strategies should be designed for optimal test calibration.

large language model, machine learning, natural language, (18 more...)

2405.13907

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > United Kingdom > England > Bristol (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsMar-12-2024, 18:59:06 GMT

Parameter Learning for Log-supermodular Distributions

log-partition function, maximum likelihood, submodular function, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Lv, Outongyi, Zhou, Bingxin

LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning

arXiv.org Artificial IntelligenceDec-13-2023

Modern reinforcement learning (RL) can be categorized into online and offline variants. As a pivotal aspect of both online and offline RL, current research on the Bellman equation revolves primarily around optimization techniques and performance enhancement rather than exploring the inherent structural properties of the Bellman error, such as its distribution characteristics. This study investigates the distribution of the Bellman approximation error through iterative exploration of the Bellman equation with the observation that the Bellman error approximately follows the Logistic distribution. Based on this, we proposed the utilization of the Logistic maximum likelihood function (LLoss) as an alternative to the commonly used mean squared error (MSELoss) that assumes a Normal distribution for Bellman errors. We validated the hypotheses through extensive numerical experiments across diverse online and offline environments. In particular, we applied the Logistic correction to loss functions in various RL baseline methods and observed that the results with LLoss consistently outperformed the MSE counterparts. We also conducted the Kolmogorov-Smirnov tests to confirm the reliability of the Logistic distribution. Moreover, our theory connects the Bellman error to the proportional reward scaling phenomenon by providing a distribution-based analysis. Furthermore, we applied the bias-variance decomposition for sampling from the Logistic distribution. The theoretical and empirical insights of this study lay a valuable foundation for future investigations and enhancements centered on the distribution of Bellman error.

bellman error, gumbel, logistic distribution, (15 more...)

2307.02345

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)